Võ - Scene Grammar Lab

The Scene Grammar Lab explores numerous facets of visual cognition, with a strong focus on visual attention, perception, and memory in the context of scene perception. The lab’s primary research domains encompass top-down guidance during scene search, the neural representation and evolution of scene knowledge, and the interplay between action and perception in real-life situations. We employ diverse methodologies in our research, including psychophysics, gaze-contingent and real-world eye tracking in both 2D and 3D, pupillometry, VR, and EEG recordings.
Scene Semantics and Syntax
Our visual world is complex, yet it typically adheres to strict rules, known as a scene’s “grammar” or scene semantics and syntax, which aid object recognition and scene navigation. Does this “scene grammar” relate to linguistic grammar? How do violations of object placement affect viewing? Can we detect object-scene inconsistencies peripherally? How do these influence eye movements prior to and during fixation? We have begun testing whether the brain encodes scene syntax and semantics in the same way it encodes linguistic syntax and semantics. In language processing, semantic and syntactic cues elicit distinct brain responses: the N400 indexes semantic violations, while the P600 signals inconsistent syntax. Using event-related brain potentials (ERPs), we found a corresponding dissociation between semantic and syntactic signals for scenes: semantic inconsistencies caused negative deflections in the N300/N400 time window, whereas syntactic inconsistencies triggered a late positivity similar to the P600 observed with linguistic syntax manipulations (Võ & Wolfe, 2013). Since then, we have used the N400 brain response across various paradigms and age groups as an indicator of semantic object-scene processing (cf. the collected works of our former PhD student Tim Lauer as well as studies by Dejan Draschkow and Laura Maffongelli).
Rapid Scene Understanding
Global and local scene features, along with prior knowledge of likely object locations, help us quickly narrow our search during scene viewing. What initial information guides our search, and what is the timeline of this scene processing? We have used various experimental methods to investigate such “scene guidance.” One approach, the “flash-preview moving-window” paradigm (Castelhano & Henderson, 2007; Võ & Henderson, 2010, 2011), combines brief scene previews with a gaze-contingent moving window: a masked scene preview is shown briefly, followed by a target word, and when the scene reappears it is viewed only through a small window that moves with the observer’s gaze. This allows the preview information to be manipulated independently of the subsequent search. Varying the preview content and duration demonstrates that even an incomplete preview can aid search, depending on individual processing speed (Võ & Schneider, 2010). A mere 50 ms glimpse of a scene can guide search, as long as there is enough time to combine prior knowledge with the visual input (Võ & Henderson, 2010).
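To make the gaze-contingent component more concrete, here is a minimal sketch of the windowing logic in Python/NumPy. It is only an illustration under simplifying assumptions (a grayscale image array, a circular window, and function and parameter names of our own choosing), not the stimulus code actually used in these studies.

```python
import numpy as np

def moving_window(scene: np.ndarray, gaze_x: int, gaze_y: int,
                  radius: int = 80, background: float = 0.5) -> np.ndarray:
    """Return a copy of `scene` that is visible only within a circular window
    centred on the current gaze position; everything outside the window is
    replaced by a uniform background (an illustrative stand-in for the mask)."""
    h, w = scene.shape
    ys, xs = np.ogrid[:h, :w]
    inside = (xs - gaze_x) ** 2 + (ys - gaze_y) ** 2 <= radius ** 2
    out = np.full_like(scene, background)
    out[inside] = scene[inside]
    return out

# Toy usage: a random "scene" and one gaze sample at (320, 240).
scene = np.random.rand(480, 640)
frame = moving_window(scene, gaze_x=320, gaze_y=240)
```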
Scene-Related Interactions Between Attention and Memory
When we perceive scenes and interact with them, our attention is strongly guided by memory. Semantic memory in the form of knowledge about the meaning and structure of our surroundings enables us to navigate and understand novel environments with relative ease (Draschkow & Võ, 2017; Võ & Wolfe, 2013). By manipulating scenes in various ways, we can subvert prior expectations to understand the underlying guiding mechanisms of attention. Specifically, we have demonstrated that some objects—we call them anchors—have a particularly strong predictive value, directing our attention toward likely locations of other objects. The efficiency of our eye and body movements relies on this guidance (Boettcher et al., 2018; Helbing et al., 2022). The relationship between memory and attention is, however, not a one-way street: New episodic memory representations are constantly created as a byproduct of natural behavior and are of high behavioral relevance. In fact, searching for an object can even result in a stronger memory representation of that object than memorizing it intentionally—as long as the search process occurs in the context of a real-world scene (search superiority effect; Draschkow et al., 2014; Helbing et al., 2020). In our lab, we investigate these scene-related interactions between long-term memory and attentional guidance to get a comprehensive picture of the cognitive determinants of successful adaptive behavior. Advancing our understanding of the neural underpinnings of both memory-guided attention and attention-dependent memory formation will be crucial to this endeavor.
Related Publications
- Boettcher, S. E. P., Draschkow, D., Dienhart, E., & Võ, M. L.-H. (2018). Anchoring visual search in scenes: Assessing the role of anchor objects on eye movements during visual search. Journal of Vision, 18(13), Article 11. https://doi.org/10.1167/18.13.11
- Draschkow, D., & Võ, M. L.-H. (2017). Scene grammar shapes the way we interact with objects, strengthens memories, and speeds search. Scientific Reports, 7(1), Article 16471. https://doi.org/10.1038/s41598-017-16739-x
- Draschkow, D., Wolfe, J. M., & Võ, M. L.-H. (2014). Seek and you shall remember: Scene semantics interact with visual search to build better memories. Journal of Vision, 14(8), Article 10. https://doi.org/10.1167/14.8.10
- Helbing, J., Draschkow, D., & Võ, M. L.-H. (2020). Search superiority: Goal-directed attentional allocation creates more reliable incidental identity and location memory than explicit encoding in naturalistic virtual environments. Cognition, 196, Article 104147. https://doi.org/10.1016/j.cognition.2019.104147
- Helbing, J., Draschkow, D., & Võ, M. L.-H. (2022). Auxiliary scene-context information provided by anchor objects guides attention and locomotion in natural search behavior. Psychological Science, 33(9), 1463–1476. https://doi.org/10.1177/09567976221091838
- Võ, M. L.-H., & Wolfe, J. M. (2013). The interplay of episodic and semantic memory in guiding repeated search in scenes. Cognition, 126(2), 198–212. https://doi.org/10.1016/j.cognition.2012.09.017
Lab website: Scene Grammar Lab
Schütz-Bosbach - Body Social Cognition & Action Lab

We run the Body Social Cognition and Action Lab. Our research focuses on the relationships between human sensation, cognition, and action, both within the individual and in social contexts. In particular, we aim to address questions about the neuro-cognitive basis of an embodied self-representation, sometimes called a “bodily self,” which arises from the sensory and motor signals that accompany nearly every bodily activity. We interpret the bodily self mainly in terms of a sense of agency (the experience of acting voluntarily and controlling the consequences of one’s actions) and a sense of ownership (the sense of oneself as the subject of experience). Moreover, we assume that bodily self-representation is especially driven by interactions with the people who surround us, as, for example, governed by a mirror mechanism. Methodologically, we use techniques from both experimental psychology (e.g., reaction times, detection thresholds) and neurophysiology, such as EEG, fMRI, and TMS.
Interoceptive Inference
The idea of predictive coding, namely that perception results from combining sensory feedback with prior predictions of sensory outcomes, has only recently been applied to interoception. Recent theoretical models extend this framework to the sense of agency and interoceptive states. We are currently testing these models’ assumptions empirically to determine whether they provide an accurate account of how internal awareness is constructed.
Sense of Agency and Cognitive Control
Sense of agency is the feeling of being in control of one's actions. It is a subtle feeling that we rarely think about consciously, yet it may be critically important for engaging with the world in a self-determined way. For example, in psychological disorders such as schizophrenia, people are not always able to tell whether their actions are under their own control. We are investigating which factors determine our sense of agency and how it can be assessed objectively.
Geyer - MEMVIS Lab (MEMory in VIsual Search)
We use visual search, a central and well-established paradigm in cognitive neuroscience. In search paradigms, ecological situations (e.g., finding a goal-relevant object in a cluttered, resource-limited environment, such as an edible berry among inedible leaves or toxic berries) are re-created in the laboratory by presenting, on a computer screen, an array of stimuli in which one stimulus is the ‘target’ and the others are ‘non-target’ (distractor) elements. The set of possible actions is usually restricted to pressing one of several alternative response buttons (e.g., to indicate target presence). Items that differ from the surrounding elements in a certain feature (bottom-up stimulus salience) or that are expected by our (healthy human) participants (top-down guidance) attract attention. Importantly, search performance is not ‘a-historic’; rather, perceptual, cognitive, and motor events in the immediate past modulate performance on a given trial. We have evidence that such ‘priors’ are accumulated about the target’s position relative to a repeated array of distractor elements (the contextual-cueing effect) and subsequently expedite search reaction times (RTs) as well as fixational and head-scanning patterns, accompanied by a series of lateralized EEG/ERP markers (enhanced for repeated arrays) and fMRI activity in subcortical and cortical networks, including the hippocampus, the dorsal frontoparietal network, and sensory circuits.
In our lab, we study how statistical learning speeds visual search from behavioral, neural, and computational standpoints. We use behavioral manipulations and track their effects with a whole series of data-recording and visualization techniques, ranging from psychophysics and eye- and head-tracking to peripheral physiological recordings and brain-imaging methods such as EEG, fMRI, and combined EEG-fMRI. We also use head-mounted immersive virtual reality (VR) to create large-scale visual-search environments that recruit different (eye/head) orienting systems, allowing us to study statistical learning under ecologically valid search conditions. Further, we use mathematical models to simulate how participants build long-term memory about regularities of the world in order to make rapid and efficient search decisions. Another aspect of our work relates to motivational factors and how these shape learning-related expectancies. In recent years, we have also begun to investigate the mechanisms of experience-driven multisensory (visual, tactile, and auditory) attention.
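As a toy illustration of this modeling approach (our own sketch, not one of the published models), the following Python snippet shows how repeatedly encountering the same distractor configuration can build up a location prior that shortens search: the learner counts where the target has appeared in each repeated context and scans the most probable locations first. All names and parameters are illustrative.

```python
import random
from collections import defaultdict

N_LOCATIONS = 12  # possible target positions in a search display

# Per-context counts of where the target has appeared so far.
location_counts = defaultdict(lambda: [0] * N_LOCATIONS)

def search_cost(context_id, target_loc):
    """Number of locations inspected before the target is found,
    scanning locations in order of their learned probability."""
    counts = location_counts[context_id]
    scan_order = sorted(range(N_LOCATIONS), key=lambda i: -counts[i])
    cost = scan_order.index(target_loc) + 1
    counts[target_loc] += 1          # update the prior after the trial
    return cost

# Repeated context: the target always appears at location 7.
repeated = [search_cost("context_A", 7) for _ in range(20)]
# Novel contexts: a fresh configuration (and random target) on every trial.
novel = [search_cost(f"novel_{t}", random.randrange(N_LOCATIONS))
         for t in range(20)]

print("mean scan cost, repeated:", sum(repeated) / len(repeated))
print("mean scan cost, novel:   ", sum(novel) / len(novel))
```

Running this toy simulation yields a lower mean scan cost for the repeated context, a rough analogue of the RT benefit observed in contextual cueing.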
Team: Thomas Geyer, Artyom Zinchenko, Chengyu Fan, Daniel Weinert, Ananya Mandal, Werner Seitz
Shi - MSense Lab

The MSense Lab explores how the brain synthesizes information from various sensory inputs, like touch and vision. We are mainly focused on the impact of multisensory contexts on our search performance and how we learn to use probabilistic contexts to reduce distractions. Additionally, we examine how different types of context affect our perception of time (see our research topics and publications for more details). To address these inquiries, we employ a range of methods, including behavioral experiments, brain imaging, and computational modeling.
Statistical Learning and Suppression
Statistical learning enables individuals to implicitly recognize and utilize environmental regularities, such as the frequent appearance of distractors in specific locations during visual search tasks. This capability allows the cognitive system to anticipate and suppress potential distractions, thereby enhancing search efficiency. Recent research has debated whether such learned suppression is proactive (occurring before distractor onset), reactive (following distractor detection), or merely a habituation effect. Our research focuses on distinguishing these mechanisms across various scenarios to better understand the dynamics of attentional control.
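As a schematic example of the proactive account (a sketch of our own, with illustrative numbers rather than a specific published model), a spatial priority map can be down-weighted at locations where distractors have frequently appeared, so that even a salient distractor at the high-probability location rarely wins the competition for attention:

```python
import numpy as np

N_LOCATIONS = 8

# Learned distractor probability per location (e.g., location 2 holds the
# salient distractor on most trials); values here are illustrative.
distractor_prob = np.full(N_LOCATIONS, 0.05)
distractor_prob[2] = 0.65

def priority_map(salience: np.ndarray, suppression_strength: float = 1.0) -> np.ndarray:
    """Combine bottom-up salience with learned suppression weights.
    Proactive account: the weights are in place before the display appears."""
    suppression = 1.0 - suppression_strength * distractor_prob
    return salience * suppression

# One simulated trial: target at location 5, salient distractor at location 2.
salience = np.ones(N_LOCATIONS)
salience[5] = 1.2   # target is somewhat salient
salience[2] = 1.5   # color-singleton distractor is the most salient item
attended_first = int(np.argmax(priority_map(salience)))
print("first attended location:", attended_first)  # suppression favors the target
```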
Bayesian Inference in Perception and Action
Human perception and action are influenced by the brain’s ability to interpret sensory information in the context of prior experiences and environmental regularities. Two notable phenomena exemplifying this are the central tendency effect and serial dependence, both of which can be understood through the lens of Bayesian inference. Our research into these phenomena through Bayesian inference aims to provide insights into how the brain integrates prior experiences with sensory evidence to guide perception and action. Understanding these mechanisms not only elucidates fundamental cognitive processes but also informs the development of models that predict perceptual behavior in uncertain environments.
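To illustrate the central tendency effect in these terms, the following sketch implements a textbook Bayesian observer (Gaussian prior and likelihood, with made-up parameters rather than fitted values): combining a noisy measurement with a prior centred on the contextual mean pulls estimates toward that mean, so long stimuli are underestimated and short stimuli overestimated.

```python
import numpy as np

rng = np.random.default_rng(1)

# Prior over stimulus magnitude (e.g., duration in ms), learned from context.
prior_mean, prior_sd = 600.0, 100.0
sensory_sd = 150.0  # measurement noise; more noise -> stronger central tendency

def bayesian_estimate(stimulus: float) -> float:
    """Posterior mean for a Gaussian prior and Gaussian likelihood:
    a precision-weighted average of the prior mean and the measurement."""
    measurement = stimulus + rng.normal(0.0, sensory_sd)
    w_prior = (1 / prior_sd**2) / (1 / prior_sd**2 + 1 / sensory_sd**2)
    return w_prior * prior_mean + (1 - w_prior) * measurement

for s in (400.0, 600.0, 800.0):
    estimates = [bayesian_estimate(s) for _ in range(1000)]
    print(f"stimulus {s:.0f} ms -> mean estimate {np.mean(estimates):.0f} ms")
```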
Lab website: MSense Lab
Bahrami - Crowd Cognition Lab

Humans interact with each other – and soon with AI – to share information and make decisions. Our lab investigates the cognitive and neurobiological basis of interactive decision-making. We use behavioural psychological testing, functional and structural brain imaging, and psychopharmacology to understand human interactive behaviour.
Human-AI Interaction
Autonomous interactive AI is already present in our lives, and this presence will increase very quickly, very soon. The result will be a hybrid society in which humans have to invent new norms of social interaction as new situations arise, many of which will have no evolutionary precedent. Take, for example, a self-driving VW equipped with the capacity to coordinate with, and access the distributed collective memory of, all other VWs on the street. To human drivers, this will be a completely new situation.
Shared Responsibility in Collective Decisions
Contradictory maxims such as “two heads are better than one” and “too many cooks spoil the broth” raise the question of why people engage in collective decisions. We propose that an overlooked and very important reason to join collectives is sharing responsibility for decision outcomes. Sharing responsibility with others protects individuals from possible negative consequences of difficult and uncertain decisions by reducing regret, punishment, and stress.
Lab website: Crowd Cognition Group
Soutschek - Motivated Cognition and Decision-Making Laboratory
From choosing what to eat for lunch to deciding how much effort to invest in our goals, daily life constantly requires us to decide between different courses of action. Our research group investigates the psychological, neural, and computational foundations of value-based decisions in healthy and clinical populations. I am particularly interested in the neural foundations of impulse control, metacognition, mentalization, and mental effort, as well as in how these influence decision-making processes. To this end, I combine computational models of decision-making with experimental and neuroscientific methods, including neuroimaging (fMRI, spectroscopy, EEG) and neural interventions (TMS, tDCS, psychopharmacology).
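As a generic example of the kind of computational model used in this field, here is a standard softmax choice rule applied to effort-discounted values; the linear discounting form, parameter values, and function names are illustrative assumptions, not the group's actual model.

```python
import math

def discounted_value(reward: float, effort: float, k: float) -> float:
    """Linear effort discounting: subjective value drops with required effort."""
    return reward - k * effort

def choice_probability(value_a: float, value_b: float, beta: float) -> float:
    """Softmax probability of choosing option A; beta = choice consistency."""
    return 1.0 / (1.0 + math.exp(-beta * (value_a - value_b)))

# Toy decision: a high-reward/high-effort option vs. a low-reward/low-effort one.
v_high = discounted_value(reward=10.0, effort=8.0, k=0.8)   # 3.6
v_low = discounted_value(reward=4.0, effort=1.0, k=0.8)     # 3.2
p_high = choice_probability(v_high, v_low, beta=1.5)
print(f"P(choose high-effort option) = {p_high:.2f}")
```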